Sketching for Principal Component Regression

Authors

  • Liron Mor-Yosef
  • Haim Avron
Abstract

Principal component regression (PCR) is a useful method for regularizing linear regression. Although conceptually simple, straightforward implementations of PCR have high computational costs and so are impractical when learning with large-scale data. In this paper, we propose efficient algorithms for computing approximate PCR solutions that are, on the one hand, high-quality approximations to the true PCR solutions (when viewed as minimizers of a constrained optimization problem) and, on the other hand, admit rigorous risk bounds (when viewed as statistical estimators). In particular, we propose input-sparsity-time algorithms for approximate PCR. We also consider computing approximate PCR solutions in the streaming model, as well as kernel PCR. Empirical results demonstrate the excellent performance of the proposed methods.
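
To make the idea of approximating PCR by sketching concrete, here is a minimal NumPy illustration. It is not the algorithm proposed in the paper: it is a generic sketch under stated assumptions, in which the dominant subspace is estimated from a CountSketch-compressed copy of the data and the regression is then solved in that subspace. The function names `pcr`, `countsketch`, and `sketched_pcr`, and the parameters `k` (number of components) and `s` (sketch size), are hypothetical names chosen for this example.

```python
import numpy as np

def pcr(A, b, k):
    """Exact PCR: regress b on the top-k principal directions of A."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k].T                                  # d x k basis of the dominant subspace
    y, *_ = np.linalg.lstsq(A @ Vk, b, rcond=None)
    return Vk @ y                                  # solution constrained to span(Vk)

def countsketch(A, s, rng):
    """Compress the n rows of A into s rows with a CountSketch matrix.

    Each row of A is multiplied by a random sign and added to one
    randomly chosen row of the output, so A is read only once.
    """
    n = A.shape[0]
    rows = rng.integers(0, s, size=n)              # hash bucket for each row
    signs = rng.choice([-1.0, 1.0], size=n)        # random sign for each row
    SA = np.zeros((s, A.shape[1]))
    np.add.at(SA, rows, signs[:, None] * A)        # unbuffered scatter-add
    return SA

def sketched_pcr(A, b, k, s, rng):
    """Approximate PCR (illustrative assumption, not the paper's method):
    take the top-k right singular vectors of the small sketch S @ A
    instead of those of A, then solve the k-dimensional regression.
    """
    SA = countsketch(A, s, rng)
    _, _, Vt = np.linalg.svd(SA, full_matrices=False)
    Vk = Vt[:k].T
    y, *_ = np.linalg.lstsq(A @ Vk, b, rcond=None)
    return Vk @ y
```

A call such as `sketched_pcr(A, b, k=5, s=400, rng=np.random.default_rng(0))` replaces the SVD of the n-by-d matrix `A` with the SVD of an s-by-d matrix. How large `s` must be for a given accuracy depends on the data and is not addressed by this example.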

Similar resources

An application of principal component analysis and logistic regression to facilitate production scheduling decision support system: an automotive industry case

Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to multiple variables and dynamic factors arising from the uncertainties surrounding PPC. Although the literature on exact scheduling algorithms, simulation approaches, and heuristic methods in production planning is extensive, they seem to be ineff...


Sketching for Kronecker Product Regression and P-splines

TensorSketch is an oblivious linear sketch introduced in (Pagh, 2013) and later used in (Pham and Pagh, 2013) in the context of SVMs for polynomial kernels. It was shown in (Avron et al., 2014) that TensorSketch provides a subspace embedding, and therefore can be used for canonical correlation analysis, low rank approximation, and principal component regression for the polynomial kernel. We tak...


A Radhika: Effective Summary for Massive Data Set

Research efforts attempt to investigate the growing size of data, with increasing interest in designing effective algorithms for space and time reduction. Applying high-dimensional techniques to large data sets is difficult. However, randomized techniques are used for analyzing data sets where the performance of the data from part of storage in networks needs to be collected and analyzed continuou...


Surface EMG-based Sketching Recognition Using Two Analysis Windows and Gene Expression Programming

Sketching is one of the most important processes in the conceptual stage of design. Previous studies have relied largely on analyses of the sketching process and its outcomes, whereas surface electromyographic (sEMG) signals associated with sketching have received little attention. In this study, we propose a method in which 11 basic one-stroke sketching shapes are identified from the sEMG signals ...


Predicting the Young's Modulus and Uniaxial Compressive Strength of a typical limestone using the Principal Component Regression and Particle Swarm Optimization

In geotechnical engineering, rock mechanics and engineering geology, depending on the project design, the uniaxial strength and static Young's modulus of rocks are of vital importance. The direct determination of the aforementioned parameters in the laboratory, however, requires intact and high-quality cores, and the preparation of their specimens has some limitations. Moreover, performing thes...


Publication date: 2018